The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification

Abstract

In order to simplify sentences, several rewriting operations can be performed, such as replacing complex words with simpler synonyms, deleting unnecessary information, and splitting long sentences. Despite this multi-operation nature, evaluation of automatic simplification systems relies on metrics that moderately correlate with human judgments on the simplicity achieved by executing specific operations (e.g., simplicity gain based on lexical replacements). In this article, we investigate how well existing metrics can assess sentence-level simplifications where multiple operations may have been applied and which, therefore, require more general simplicity judgments. For that, we first collect a new and more reliable data set for evaluating the correlation of metrics and human judgments of overall simplicity. Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics' scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation. We show that these three aspects affect the correlations and, in particular, highlight the limitations of commonly used operation-specific metrics. Finally, based on our findings, we propose a set of recommendations for automatic evaluation of multi-operation simplifications, suggesting which metrics to compute and how to interpret their scores.
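As a rough illustration of why the abstract breaks correlations down by dimensions such as system type, the sketch below computes a Pearson correlation between metric scores and human simplicity ratings, first pooled and then per group. Everything in it is an assumption made for illustration: the function names `pearson` and `correlation_by_group`, the group labels, and the numbers are hypothetical and are not taken from the article or its data set, and the article may use a different correlation measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlation_by_group(records):
    """Group (group, metric_score, human_rating) triples and correlate each group."""
    groups = {}
    for group, metric_score, human_rating in records:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(metric_score)
        ys.append(human_rating)
    return {group: pearson(xs, ys) for group, (xs, ys) in groups.items()}


# Made-up toy data: (hypothetical system type, metric score, human rating).
records = [
    ("neural", 0.30, 2.0), ("neural", 0.40, 3.0), ("neural", 0.50, 4.0),
    ("rule_based", 0.30, 4.0), ("rule_based", 0.40, 3.0), ("rule_based", 0.50, 2.0),
]

# Correlation over all outputs, ignoring which system produced them.
pooled = pearson([r[1] for r in records], [r[2] for r in records])
# Correlation computed separately for each system type.
per_group = correlation_by_group(records)

print("pooled:", round(pooled, 3))
for group, r in per_group.items():
    print(group, round(r, 3))
```

In this toy data the pooled correlation is close to zero while the two groups correlate strongly in opposite directions, which is the kind of effect a single aggregate correlation would hide.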

Similar resources

Automatic induction of rules for text simplification

Long and complicated sentences pose various problems to many state-of-the-art natural language technologies. We have been exploring methods to automatically transform such sentences so as to make them simpler. These methods involve the use of a rule-based system, driven by the syntax of the text in the domain of interest. Hand-crafting rules for every domain is time-consuming and impractical. This...

Automatic Text Simplification for Spanish: Comparative Evaluation of Various Simplification Strategies

In this paper, we explore statistical machine translation (SMT) approaches to automatic text simplification (ATS) for Spanish. First, we compare the performances of the standard phrase-based (PB) and hierarchical (HIERO) SMT models in this specific task. In both cases, we build two models, one using the TS corpus with “light” simplifications and the other using the TS corpus with “heavy” simpli...

One Step Closer to Automatic Evaluation of Text Simplification Systems

This study explores the possibility of replacing the costly and time-consuming human evaluation of the grammaticality and meaning preservation of the output of text simplification (TS) systems with some automatic measures. The focus is on six widely used machine translation (MT) evaluation metrics and their correlation with human judgements of grammaticality and meaning preservation in text sni...

The C-Score ‐ Proposing a Reading Comprehension Metrics as a Common Evaluation Measure for Text Simplification

This article addresses the lack of common approaches for text simplification evaluation, by presenting the first attempt for a common evaluation metrics. The article proposes reading comprehension evaluation as a method for evaluating the results of Text Simplification (TS). An experiment, as an example application of the evaluation method, as well as three formulae to quantify reading comprehe...

Automatic Metrics for Genre-specific Text Quality

To date, researchers have proposed different ways to compute the readability and coherence of a text using a variety of lexical, syntax, entity and discourse properties. But these metrics have not been defined with special relevance to any particular genre but rather proposed as general indicators of writing quality. In this thesis, we propose and evaluate novel text quality metrics that utiliz...

Journal

Journal title: Computational Linguistics

Year: 2021

ISSN: 1530-9312, 0891-2017

DOI: https://doi.org/10.1162/coli_a_00418